Add a pre-commit script to detect missing i18n implementations #9428
for more information, see https://pre-commit.ci
…dgezero-one/openlibrary into 9423/feat/detect-missing-i18n
So far this appears to be working as expected according to the issue description (except I'm not sure how to get it to run only against PR files instead of all files in the action runner). However, my regex approach flags quite a lot of what I think are false positives (example: https://results.pre-commit.ci/run/github/69609/1718241262.NJ_4UJGwTqOunVrs7ivFkA). Some examples:
(I am assuming that this is an indication that the space should be deleted and not that the script should account for this?)
Please let me know what your thoughts are on the above and I'll be happy to make any changes!
@pidgezero-one This is looking great!!
And re: how to just run on changed files, I believe |
Thank you @rebecca-shoptaw ! I've added some changes that will handle the first four points and will need to think a bit more about how to get the fifth to work. When I run |
@pidgezero-one Great, thank you! Definitely worth waiting for Drini's input on number 5; he has a solid method he thought through for it some time ago and can easily pass it along. Very glad it's running against staged files locally; we can investigate re: the CI.
That all looks good! For 5, I'd say if the line/char with the |
Thank you for the guidance @rebecca-shoptaw and @cdrini! 😊 I believe the next challenge is that it flags elements which contain punctuation and no words - do we want to go as far as to treat those as errors (or even warnable if encapsulated in |
Wow the results for this are looking so great!! This is going to be so useful 😊 Hmmm can you give some examples? I saw a few with |
Oh also can you try running the script with |
@cdrini So I think this might actually be a trivial problem to solve with the I think that access to that operator could be useful to have in general (I've used it in other projects for regexes that needed to look for non-romance words, for example) but wanted to get your (and @rebecca-shoptaw )'s opinions before doing anything drastic. And when you say running the script with |
Hmm, how would |
entry: python ./scripts/detect_missing_i18n.py
types: [html]
language: python
verbose: true
I've added this so that SKIP lines will still show even when the exit code is 0. Pre-commit will hide the script output if the exit code is 0.
TIL about |
We only had two files outside of the valid dirs, and it makes this CLI a little cleaner since every specified file will now be processed instead of some silently skipped.
Ok looking good! Ah, this is failing locally; the text is just hidden since there's a lot of output. The bug is related to when the errcount is incremented! Should be super close now 😊
Run pre-commit run detect-missing-i18n --all-files then echo $? to see the status code.
I'm going to be offline for the next spell but this is the last bug then we should be good to go! Would you mind taking over for me on this one @rebecca-shoptaw ?
scripts/detect_missing_i18n.py
Outdated
elif includes_error_attribute:
    char_index = includes_error_attribute.start()
    errtype = Errtype.ERR
    errcount += 1
Ah this is where the bug is! The errors are being counted even when we continue down below. We should only +1 at the end after we run all the continues.
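For readers following the thread, here is a minimal sketch of the fix described above: the counter is only incremented once every skip rule has had its chance to `continue`. This is not the script from the PR. The two regexes, the skip rules, and the `Errtype.WARN` member are assumptions made up for illustration; only the names `errcount`, `errtype`, and `Errtype.ERR` come from the snippet under review.

```python
import re
from enum import Enum


class Errtype(Enum):
    WARN = "WARN"  # assumed member, not shown in the reviewed snippet
    ERR = "ERR"


# Hypothetical patterns, for illustration only.
TEXT_IN_TAG = re.compile(r">\s*([^<>\s][^<>]*?)\s*<")
I18N_CALL = re.compile(r"\$:?_\(")


def count_errors(lines):
    """Count lines that are still errors after every skip rule has run."""
    errcount = 0
    for line in lines:
        match = TEXT_IN_TAG.search(line)
        if not match:
            continue
        errtype = Errtype.ERR
        # Skip rules: none of these should contribute to the count.
        if I18N_CALL.search(line):
            continue
        if not any(ch.isalpha() for ch in match.group(1)):
            continue
        # No `continue` can be hit past this point, so counting is safe.
        if errtype is Errtype.ERR:
            errcount += 1
    return errcount
```

Incrementing at the top of the branch, as in the outdated snippet, would also count lines that a later rule decides to skip, which is how a run could exit non-zero while printing no visible errors.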
I think there's still a discrepancy - I'm also referring to the case where every file is skipped. Up until my most recent changes, |
Co-authored-by: Drini Cami <cdrini@gmail.com>
Figured it out - the CI is running the script against files I hadn't yet pulled locally and detecting errors in them. I've added all outstanding files to the exclude list so we can start tackling them. All checks pass now!
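A rough sketch of how an exclude list like this can interact with the exit code, assuming excluded files are reported as SKIP and never counted as errors. The exclude entry, the `process_files` function, the `check_file` callback, and the report format are all hypothetical; the real list and output live in scripts/detect_missing_i18n.py.

```python
from pathlib import Path

# Hypothetical exclude list entry, for illustration only.
EXCLUDE_LIST = {"openlibrary/templates/legacy.html"}


def process_files(paths, check_file, exclude=EXCLUDE_LIST):
    """Check each given path unless excluded; return (exit_code, report_lines).

    `check_file` is a caller-supplied function returning the number of
    errors found in one file.
    """
    errcount = 0
    report = []
    for path in paths:
        key = Path(path).as_posix()
        if key in exclude:
            # Skipped files are reported but never affect the exit code.
            report.append(f"SKIP {key}")
            continue
        found = check_file(key)
        if found:
            report.append(f"ERR {key}: {found} issue(s)")
        errcount += found
    return (1 if errcount else 0), report
```

With only excluded files passed in, the exit code is 0 but the report still contains SKIP lines, which is the case where the hook's `verbose: true` setting matters.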
Niiiice good thinking! This code lgtm! 😊 @rebecca-shoptaw if you could give it a quick sanity check QA, and then merge, that would be awesome! (I'm afk for a spell 😁)
@pidgezero-one Great work on this!!
QA:
Tried running pre-commit run --all-files with no i18n formatting changes:
✅ Detect missing i18n check passes
❓ Output for all skipped files appears, but if this only happens for --all-files, which seems to be the case, that's fine with me and could be useful
❓ It would be awesome if the skipped files comment was even clearer re: the process to fix the problem(s), i.e. "remove the file from exclude list then run the script again to see what the issue is"
Tried running pre-commit with a test commit after making an i18n formatting error in a non-skipped file:
✅ Successfully got an error message when un-i18n-syntaxing both basic text and complex edge cases
✅ Did not see skipped files output
Tried running pre-commit with a test commit after adding new non-i18n-syntaxed text:
✅ Successfully got an error message
✅ Error message disappeared and commit succeeded when I fixed the syntax
Looks great to me! I'm happy to approve a merge, and I'm thinking that once it's merged I'll assign myself a deep dive into those skipped files to try to fix as many as I can, and can bundle a minor instructions clarification into that. 🙂
Thank you @rebecca-shoptaw ! 😊
Yep, this is the case! Regardless of if a file is skipped or errored, it'll only show if that file was passed to the script, so without --all-files it would only apply to staged files.
By "skipped files comment" what does this refer to? A comment in the code, or the script's output, or the instructions I put in the wiki? I'll be happy to address that ASAP so it can be merged - unfortunately the longer this PR remains open, the more updates I'll have to make to it as other PRs are merged with HTML changes that will set off the CI 😵💫 |
@pidgezero-one No need to make any more changes! I just meant the comment in the code itself (added with "explain exclude list"), and I'll tweak it if necessary when I do my new/related PR to fix up those excluded files 🙂 I've approved the changes, we just need a staff member with merge powers to go ahead and merge.
@rebecca-shoptaw Oh okay cool! Thank you so much! 😄
Woohoo!! Thank you so much folks!!
Closes #9423
Technical
Parses the contents of html files in specific directories to detect the line and column numbers for tags that might be missing i18n implementation.
Mostly a series of loops to ensure that the provided indexes are correct.
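As an illustration of the index bookkeeping described above, a minimal sketch that turns a regex match offset into 1-based line and column numbers. The pattern and the `find_untranslated` function are hypothetical and far simpler than the checks in the PR: they only flag text that sits directly between two tags and does not start with `$`.

```python
import re

# Hypothetical pattern: visible text directly between two tags that does
# not begin with `$` (so `$_("...")` calls are not flagged).
TEXT_BETWEEN_TAGS = re.compile(r">(\s*)([^<>\s$][^<>]*)<")


def find_untranslated(html):
    """Return (line, column, text) tuples, with 1-based line and column."""
    results = []
    for match in TEXT_BETWEEN_TAGS.finditer(html):
        offset = match.start(2)
        # Line number: newlines before the offset, plus one.
        line = html.count("\n", 0, offset) + 1
        # Column: distance from the previous newline. rfind returns -1 on
        # the first line, which conveniently yields a 1-based column.
        column = offset - html.rfind("\n", 0, offset)
        results.append((line, column, match.group(2).strip()))
    return results
```

For example, `find_untranslated('<div>\n  <p>Hello</p>\n  <p>$_("Hi")</p>\n</div>')` reports the `Hello` on line 2, column 6, and ignores the text already wrapped in `$_()`.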
Testing
I've been adding junk data to html files and running both pre-commit run and pre-commit run --all-files, and it looks like the scope of those two different commands is correct. However, the regex approach returns quite a lot of what I think are false positives.
Screenshot
Stakeholders
@rebecca-shoptaw